Chapter 6 Convolutional Neural Network (ConvNet)

Table of Contents

Architecture of ConvNet

Representative hand-crafted feature extraction methods include SIFT, HOG, Textons, Spin Images, RIFT, and GLOH.
The feature extraction part of the network consists of stacks of convolutional layer and pooling layer pairs. The convolution and pooling operations work on two-dimensional planes; this is one of the key differences between the ConvNet and other neural networks.
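To make the stacking concrete, here is a minimal sketch (in Python, for illustration only; the chapter's own code is MATLAB) of how the spatial size shrinks through one convolution-pooling pair, using the sizes of the MNIST network built later in this chapter:

```python
def conv_out(n, f):
    """Spatial size after a 'valid' f x f convolution on an n x n input."""
    return n - f + 1

def pool_out(n, p):
    """Spatial size after non-overlapping p x p pooling."""
    return n // p

# Shape flow for the MNIST network used later in this chapter:
# 28x28 input -> 9x9 convolution -> 2x2 mean pooling
n = 28
n = conv_out(n, 9)   # 20
n = pool_out(n, 2)   # 10
print(n)  # 10x10 per feature map; with 20 maps, 10*10*20 = 2000 inputs to W5
```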

Convolution Layer

weight --> 2D convolutional kernel
weighted sum --> 2D convolution
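The "weighted sum" view can be made concrete with a minimal pure-Python 2-D cross-correlation, which is what deep learning frameworks actually compute under the name "convolution" (an illustrative sketch, not the book's code):

```python
def corr2d(x, w):
    """'Valid' 2-D cross-correlation: each output element is the
    weighted sum of the kernel over one window of the input."""
    H, W = len(x), len(x[0])
    kH, kW = len(w), len(w[0])
    out = [[0.0] * (W - kW + 1) for _ in range(H - kH + 1)]
    for i in range(H - kH + 1):
        for j in range(W - kW + 1):
            out[i][j] = sum(x[i + a][j + b] * w[a][b]
                            for a in range(kH) for b in range(kW))
    return out

x = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
w = [[1, 0],
     [0, -1]]  # simple difference kernel
print(corr2d(x, w))  # [[-4, -4], [-4, -4]]
```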

Vertical and horizontal edge detection filters

Sobel filter
designing filters by hand --> learning filters with deep learning
image = imread('cameraman.tif');
w1 = [0 -1 0; -1 4 -1; 0 -1 0]; % Laplacian filter
w2 = ones(3,3)/9; % 3x3 averaging (box) filter
filteredImage1 = imfilter(image,w1,'same','corr');
filteredImage2 = imfilter(image,w2,'same','corr');
imshow([image filteredImage1 filteredImage2])
Padding in convolution
Strided convolution
Convolutions on RGB images
Example of a layer
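For padding p, filter size f, and stride s on an n x n input, the output spatial size follows the standard rule floor((n + 2p - f) / s) + 1. A small sketch (the helper name is hypothetical, for illustration):

```python
def conv_output_size(n, f, p=0, s=1):
    """Output spatial size of a convolution:
    floor((n + 2p - f) / s) + 1."""
    return (n + 2 * p - f) // s + 1

# 'same' padding for a 3x3 filter at stride 1: p = 1 preserves the size
print(conv_output_size(6, 3, p=1, s=1))  # 6
# strided convolution: 7x7 input, 3x3 filter, stride 2, no padding
print(conv_output_size(7, 3, p=0, s=2))  # 3
# the MNIST example below: 28x28 input, 9x9 filter, no padding, stride 1
print(conv_output_size(28, 9))           # 20
```

For an RGB image, the kernel carries a matching depth (e.g. 3x3x3) and the products are summed over all channels as well, so each kernel still produces a single 2-D feature map; the spatial size formula is unchanged.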
fully connected network:
backpropagation:
convolutional network:
backpropagation:
Assume h is a 5x5 filter; then we define . Note that this operation is correlation, not the classical convolution.
https://www.cnblogs.com/pinard/p/6494810.html
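The correlation-vs-convolution point matters in backpropagation: for a forward "valid" correlation, the gradient with respect to an input pixel is the sum of the kernel entries that cover it, which is exactly a full convolution of the upstream gradient with the kernel rotated by 180 degrees. A finite-difference check in pure Python (an illustrative sketch, not the book's code; upstream gradient taken as all ones):

```python
def corr2d(x, w):  # 'valid' cross-correlation (the ConvNet "convolution")
    H, W, k = len(x), len(x[0]), len(w)
    return [[sum(x[i + a][j + b] * w[a][b] for a in range(k) for b in range(k))
             for j in range(W - k + 1)] for i in range(H - k + 1)]

def grad_wrt_input(x, w, i, j, eps=1e-6):
    """Numerical d(sum of output)/d(x[i][j])."""
    base = sum(map(sum, corr2d(x, w)))
    x2 = [row[:] for row in x]
    x2[i][j] += eps
    return (sum(map(sum, corr2d(x2, w))) - base) / eps

x = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 9.0]]
w = [[1.0, 2.0],
     [3.0, 4.0]]

# The corner pixel is covered only by w[0][0] = 1; the center pixel is
# covered by all four entries, whose sum is 1 + 2 + 3 + 4 = 10.
print(round(grad_wrt_input(x, w, 0, 0), 3))  # 1.0
print(round(grad_wrt_input(x, w, 1, 1), 3))  # 10.0
```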
Convolution: sparse interactions (also called sparse connectivity or sparse weights) and parameter sharing

Pooling Layer

The pooling layer provides some tolerance to off-center and tilted objects. For example, pooling can improve the recognition of a cat that is off-center in the input image. In addition, because pooling reduces the image size, it relieves the computational load and helps prevent overfitting.
Backpropagation of pooling layer
Mean pooling
Max pooling
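The backpropagation rules differ: mean pooling spreads the upstream gradient evenly over the window, while max pooling routes it entirely to the position that held the maximum in the forward pass. A sketch for one 2x2 window (illustrative Python, not the book's code):

```python
def pool2x2_backward(window, up_grad, mode):
    """Backprop through one 2x2 pooling window.
    mode 'mean': each input position gets up_grad / 4.
    mode 'max' : only the argmax position gets up_grad."""
    flat = [v for row in window for v in row]
    if mode == 'mean':
        g = [up_grad / 4.0] * 4
    else:  # 'max'
        g = [0.0] * 4
        g[flat.index(max(flat))] = up_grad
    return [g[0:2], g[2:4]]

window = [[1.0, 3.0],
          [2.0, 4.0]]
print(pool2x2_backward(window, 1.0, 'mean'))  # [[0.25, 0.25], [0.25, 0.25]]
print(pool2x2_backward(window, 1.0, 'max'))   # [[0.0, 0.0], [0.0, 1.0]]
```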

Example: MNIST

The training data is the MNIST database, which contains 70,000 images of handwritten digits. In general, 60,000 images are used for training and the remaining 10,000 images are used for the validation test. Each digit image is a 28-by-28 pixel grayscale image.
% TestMnistConv.m
Images = loadMNISTImages('t10k-images.idx3-ubyte');
Images = reshape(Images, 28, 28, []);
Labels = loadMNISTLabels('t10k-labels.idx1-ubyte');
Labels(Labels == 0) = 10; % 0 --> 10
rng(1);
% Learning
W1 = 1e-2*randn([9 9 20]);% 20 convolution kernels
W5 = (2*rand(100, 2000) - 1) * sqrt(6) / sqrt(100 + 2000);
Wo = (2*rand( 10, 100) - 1) * sqrt(6) / sqrt( 10 + 100);
X = Images(:, :, 1:8000);
D = Labels(1:8000);
montage(X(:,:,1:49))
for epoch = 1:10
epoch
[W1, W5, Wo] = MnistConv(W1, W5, Wo, X, D);
end
epoch = 1
epoch = 2
...
epoch = 10
save('MnistConv.mat');
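The W5 and Wo lines above use the Xavier (Glorot) uniform initialization: weights are drawn uniformly from [-b, b] with b = sqrt(6 / (fan_in + fan_out)), which keeps activation variance roughly constant across layers. A Python sketch of the same expression (illustrative, mirroring the MATLAB code):

```python
import random, math

def xavier_uniform(fan_out, fan_in):
    """Glorot/Xavier uniform init, matching the MATLAB expression
    (2*rand(out, in) - 1) * sqrt(6) / sqrt(out + in)."""
    bound = math.sqrt(6.0 / (fan_in + fan_out))
    return [[random.uniform(-bound, bound) for _ in range(fan_in)]
            for _ in range(fan_out)]

W5 = xavier_uniform(100, 2000)  # hidden layer, as in the script above
bound = math.sqrt(6.0 / 2100)
assert all(-bound <= v <= bound for row in W5 for v in row)
```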
% Test
%
X = Images(:, :, 8001:10000);
D = Labels(8001:10000);
acc = 0;
N = length(D);
for k = 1:N
x = X(:, :, k); % Input, 28x28
y1 = Conv(x, W1); % Convolution, 20x20x20
y2 = ReLU(y1); %
y3 = Pool(y2); % Pool, 10x10x20
y4 = reshape(y3, [], 1); % 2000x1
v5 = W5*y4; % Fully connected, 100x1
y5 = ReLU(v5); % ReLU
v = Wo*y5; % Output layer, 10x1
y = Softmax(v); % Softmax
[~, i] = max(y);
if i == D(k)
acc = acc + 1;
end
end
acc = acc / N;
fprintf('Accuracy is %f\n', acc);
Accuracy is 0.973500
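The final Softmax call converts the 10 output scores into class probabilities. A numerically stable sketch (illustrative Python; the book's Softmax is a MATLAB helper whose internals are not shown here):

```python
import math

def softmax(v):
    """Numerically stable softmax: subtract the max before exponentiating
    so exp() never overflows; the result is unchanged."""
    m = max(v)
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]

scores = [2.0, 1.0, 0.1]
probs = softmax(scores)
print([round(p, 3) for p in probs])  # probabilities sum to 1; largest score wins
```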
Plot Features
% plot features
load('MnistConv.mat')
k = 2;
x = X(:, :, k);
y1 = Conv(x, W1); % Convolution, 20x20x20
y2 = ReLU(y1); %
y3 = Pool(y2); % Pool, 10x10x20
y4 = reshape(y3, [], 1); % 2000
v5 = W5*y4; % Fully connected, 100
y5 = ReLU(v5); % ReLU
v = Wo*y5; % Output layer, 10
y = Softmax(v); % Softmax
figure;
display_network(x(:));
title('Input Image')
convFilters = zeros(9*9, 20);
for i = 1:20
filter = W1(:, :, i);
convFilters(:, i) = filter(:);
end
figure
display_network(convFilters);
title('Convolution Filters')
fList = zeros(20*20, 20);
for i = 1:20
feature = y1(:, :, i);
fList(:, i) = feature(:);
end
figure
display_network(fList);
title('Features [Convolution]')
fList = zeros(20*20, 20);
for i = 1:20
feature = y2(:, :, i);
fList(:, i) = feature(:);
end
figure
display_network(fList);
title('Features [Convolution + ReLU]')
fList = zeros(10*10, 20);
for i = 1:20
feature = y3(:, :, i);
fList(:, i) = feature(:);
end
figure
display_network(fList);
title('Features [Convolution + ReLU + MeanPool]')